The Entity-Relationship Model: Toward a Unified View of Data


PETER  PIN-SHAN  CHEN 
Massachusetts  Institute  of  Technology 


A data model, called the entity-relationship model, is proposed. This model incorporates some of
the important semantic information about the real world. A special diagrammatic technique is
introduced as a tool for database design. An example of database design and description using
the model and the diagrammatic technique is given. Some implications for data integrity,
information retrieval, and data manipulation are discussed.

The entity-relationship model can be used as a basis for unification of different views of data:
the network model, the relational model, and the entity set model. Semantic ambiguities in these
models are analyzed. Possible ways to derive their views of data from the entity-relationship
model are presented.
model, data definition and manipulation, data integrity and consistency

1.  INTRODUCTION

The logical view of data has been an important issue in recent years. Three major
data models have been proposed: the network model [2, 3, 7], the relational model
[8], and the entity set model [25]. These models have their own strengths and
weaknesses. The network model provides a more natural view of data by separating
entities and relationships (to a certain extent), but its capability to achieve data
independence has been challenged [8]. The relational model is based on relational
theory and can achieve a high degree of data independence, but it may lose some
important semantic information about the real world [12, 15, 23]. The entity set
model, which is based on set theory, also achieves a high degree of data
independence, but its viewing of values such as "3" or "red" may not be natural to
some people [25].

This paper presents the entity-relationship model, which has most of the
advantages of the above three models. The entity-relationship model adopts the
more natural view that the real world consists of entities and relationships. It
incorporates some of the important semantic information about the real world
(other work in database semantics can be found in [1, 12, 15, 21, 23, and 29]).
The model can achieve a high degree of data independence and is based on set
theory and relation theory.

Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish,
but not for profit, all or part of this material is granted provided that ACM's copyright notice is
reprinting privileges were granted by permission of the Association for Computing Machinery.
Framingham, Mass., Sept. 22-24, 1975.
ACM Transactions on Database Systems, Vol. 1, No. 1, March 1976, Pages 9-36.

The entity-relationship model can be used as a basis for a unified view of data.
Most work in the past has emphasized the difference between the network model
and the relational model [22]. Recently, several attempts have been made to
reduce the differences of the three data models [4, 19, 20, 30, 31]. This paper uses
the entity-relationship model as a framework from which the three existing data
models may be derived. The reader may view the entity-relationship model as a
generalization or extension of existing models.

This paper is organized into three parts (Sections 2-4). Section 2 introduces
the entity-relationship model using a framework of multilevel views of data.
Section 3 describes the semantic information in the model and its implications for
data description and data manipulation. A special diagrammatic technique, the
entity-relationship diagram, is introduced as a tool for database design. Section 4
analyzes the network model, the relational model, and the entity set model, and
describes how they may be derived from the entity-relationship model.

2.  THE  ENTITY-RELATIONSHIP  MODEL 

2.1  Multilevel  Views  of  Data 

In the study of a data model, we should identify the levels of logical views of data
with which the model is concerned. Extending the framework developed in [18, 25],
we can identify four levels of views of data (Figure 1):

(1) Information concerning entities and relationships which exist in our minds.

(2) Information structure: organization of information in which entities and
relationships are represented by data.

(3) Access-path-independent data structure: the data structures which are not
involved with search schemes, indexing schemes, etc.

(4) Access-path-dependent data structure.

In the following sections, we shall develop the entity-relationship model step by
step for the first two levels. As we shall see later in the paper, the network model,
as currently implemented, is mainly concerned with level 4; the relational model is
mainly concerned with levels 3 and 2; the entity set model is mainly concerned
with levels 1 and 2.

2.2  Information Concerning Entities and Relationships (Level 1)

At this level we consider entities and relationships. An entity is a "thing" which
can be distinctly identified. A specific person, company, or event is an example of
an entity. A relationship is an association among entities. For instance, "father-son"
is a relationship between two "person" entities.¹


¹ It is possible that some people may view something (e.g. marriage) as an entity while other
people may view it as a relationship. We think that this is a decision which has to be made by
the enterprise administrator [27]. He should define what are entities and what are relationships
so that the distinction is suitable for his environment.

Fig. 1. Analysis of data models using multiple levels of logical views


The database of an enterprise contains relevant information concerning entities
and relationships in which the enterprise is interested. A complete description of
an entity or relationship may not be recorded in the database of an enterprise.
It is impossible (and, perhaps, unnecessary) to record every potentially available
piece of information about entities and relationships. From now on, we shall
consider only the entities and relationships (and the information concerning them)
which are to enter into the design of a database.

2.2.1  Entity and Entity Set. Let $e$ denote an entity which exists in our minds.
Entities are classified into different entity sets such as EMPLOYEE, PROJECT,
and DEPARTMENT. There is a predicate associated with each entity set to test
whether an entity belongs to it. For example, if we know an entity is in the entity
set EMPLOYEE, then we know that it has the properties common to the other
entities in the entity set EMPLOYEE. Among these properties is the
aforementioned test predicate. Let $E_i$ denote entity sets. Note that entity sets may not
be mutually disjoint. For example, an entity which belongs to the entity set
MALE-PERSON also belongs to the entity set PERSON. In this case, MALE-PERSON
is a subset of PERSON.

2.2.2  Relationship, Role, and Relationship Set. Consider associations among
entities. A relationship set, $R_i$, is a mathematical relation [5] among $n$ entities,
each taken from an entity set:

$$\{[e_1, e_2, \ldots, e_n] \mid e_1 \in E_1,\ e_2 \in E_2,\ \ldots,\ e_n \in E_n\},$$

and each tuple of entities, $[e_1, e_2, \ldots, e_n]$, is a relationship. Note that the $E_i$ in the
above definition may not be distinct. For example, a "marriage" is a relationship
between two entities in the entity set PERSON.

The role of an entity in a relationship is the function that it performs in the
relationship. "Husband" and "wife" are roles. The ordering of entities in the
definition of relationship (note that square brackets were used) can be dropped if
roles of entities in the relationship are explicitly stated as follows:
$(r_1/e_1, r_2/e_2, \ldots, r_n/e_n)$, where $r_i$ is the role of $e_i$ in the relationship.
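The definitions above can be illustrated with a minimal Python sketch. The person names, the helper functions `is_relation_on` and `with_roles`, and the choice of tuples and frozensets as representations are all assumptions made for illustration; they are not part of the model itself.

```python
from itertools import product

# Hypothetical entity set; the member names are made up for this sketch.
PERSON = {"ann", "bob", "carol", "dave"}

# A relationship set is a subset of the Cartesian product of entity sets.
# Both positions draw from PERSON, so the E_i here are not distinct.
MARRIAGE = {("bob", "ann"), ("dave", "carol")}

def is_relation_on(relationship_set, *entity_sets):
    """Check that each tuple takes its i-th entity from the i-th entity set."""
    return set(relationship_set) <= set(product(*entity_sets))

def with_roles(relationship, roles):
    """Drop positional ordering by pairing every entity with its role."""
    return frozenset(zip(roles, relationship))

# The same relationship written with explicit roles, in either order.
a = with_roles(("bob", "ann"), ("husband", "wife"))
b = with_roles(("ann", "bob"), ("wife", "husband"))
```

Once roles are stated, `a` and `b` compare equal, which is the sense in which the ordering of entities can be dropped.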

2.2.3  Attribute, Value, and Value Set. The information about an entity or a
relationship is obtained by observation or measurement, and is expressed by a set
of attribute-value pairs. "3", "red", "Peter", and "Johnson" are values. Values
are classified into different value sets, such as FEET, COLOR, FIRST-NAME,
and LAST-NAME. There is a predicate associated with each value set to test
whether a value belongs to it. A value in a value set may be equivalent to another
value in a different value set. For example, "12" in value set INCH is equivalent
to "1" in value set FEET.

An attribute can be formally defined as a function which maps from an entity
set or a relationship set into a value set or a Cartesian product of value sets:

$$f : E_i \ \text{or}\ R_i \rightarrow V_i \ \text{or}\ V_{i_1} \times V_{i_2} \times \cdots \times V_{i_n}.$$

Figure 2 illustrates some attributes defined on entity set PERSON. The attribute
AGE maps into value set NO-OF-YEARS. An attribute can map into a Cartesian
product of value sets. For example, the attribute NAME maps into value sets
FIRST-NAME and LAST-NAME. Note that more than one attribute may map
from the same entity set into the same value set (or same group of value sets).
For example, NAME and ALTERNATIVE-NAME map from the entity set
EMPLOYEE into value sets FIRST-NAME and LAST-NAME. Therefore, attribute
and value set are different concepts although they may have the same name
in some cases (for example, EMPLOYEE-NO maps from EMPLOYEE to value
set EMPLOYEE-NO). This distinction is not clear in the network model and in
many existing data management systems. Also note that an attribute is defined as
a function. Therefore, it maps a given entity to a single value (or a single tuple of
values in the case of a Cartesian product of value sets).
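As a rough illustration of the distinction between attributes and value sets, the sketch below models value sets as membership predicates and attributes as Python dictionaries (functions from entities to values). The entity identifiers `e1` and `e2`, the sample values, and the helper `maps_into` are hypothetical; the bounds in the predicates are borrowed from the schema given in Section 3.2.

```python
# Value sets modeled as membership predicates (bounds follow Section 3.2).
VALUE_SETS = {
    "NO-OF-YEARS": lambda v: isinstance(v, int) and 0 <= v <= 100,
    "FIRST-NAME": lambda v: isinstance(v, str) and len(v) <= 8,
    "LAST-NAME": lambda v: isinstance(v, str) and len(v) <= 10,
}

# Attributes as functions from entities to values; e1 and e2 are made up.
AGE = {"e1": 31, "e2": 47}
NAME = {"e1": ("Peter", "Johnson"), "e2": ("Ann", "Lee")}
ALTERNATIVE_NAME = {"e1": ("Pete", "Johnson"), "e2": ("Ann", "Lee")}

def maps_into(attribute, *value_set_names):
    """True if every image of the attribute lies in the named value set(s)."""
    for value in attribute.values():
        # A single value set takes a scalar; several take one tuple of values.
        parts = value if len(value_set_names) > 1 else (value,)
        if len(parts) != len(value_set_names):
            return False
        if not all(VALUE_SETS[n](p) for n, p in zip(value_set_names, parts)):
            return False
    return True
```

Here NAME and ALTERNATIVE_NAME are two different attributes that map into the same pair of value sets, and each maps an entity to exactly one tuple because a dictionary holds one value per key.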

Note that relationships also have attributes. Consider the relationship set
PROJECT-WORKER (Figure 3). The attribute PERCENTAGE-OF-TIME,
which is the portion of time a particular employee is committed to a particular
project, is an attribute defined on the relationship set PROJECT-WORKER. It
is neither an attribute of EMPLOYEE nor an attribute of PROJECT, since its
meaning depends on both the employee and project involved. The concept of
attribute of relationship is important in understanding the semantics of data and
in determining the functional dependencies among data.

2.2.4  Conceptual Information Structure. We are now concerned with how to
organize the information associated with entities and relationships. The method
proposed in this paper is to separate the information about entities from the
information about relationships. We shall see that this separation is useful in identifying
functional dependencies among data.

Fig. 2. Attributes defined on the entity set PERSON

Figure  4  illustrates  in  table  form  the  information  about  entities  in  an  entity  set. 
Each  row  of  values  is  related  to  the  same  entity,  and  each  column  is  related  to  a 
value  set  which,  in  turn,  is  related  to  an  attribute.  The  ordering  of  rows  and  columns 
is  insignificant. 

Figure  5  illustrates  information  about  relationships  in  a  relationship  set.  Note 
that  each  row  of  values  is  related  to  a  relationship  which  is  indicated  by  a  group 
of  entities,  each  having  a  specific  role  and  belonging  to  a  specific  entity  set. 

Note that Figures 4 and 2 (and also Figures 5 and 3) are different forms of the
same information. The table form is used for easily relating to the relational model.

Fig. 3. Attributes defined on the relationship set PROJECT-WORKER


2.3  Information  Structure  (Level  2) 

The entities, relationships, and values at level 1 (see Figures 2-5) are conceptual
objects in our minds (i.e. we were in the conceptual realm [18, 27]). At level 2,
we consider representations of conceptual objects. We assume that there exist
direct representations of values. In the following, we shall describe how to represent
entities and relationships.

2.3.1  Primary  Key.  In  Figure  2  the  values  of  attribute  EMPLOYEE-NO  can 
be  used  to  identify  entities  in  entity  set  EMPLOYEE  if  each  employee  has  a 
different  employee  number.  It  is  possible  that  more  than  one  attribute  is  needed 
to  identify  the  entities  in  an  entity  set.  It  is  also  possible  that  several  groups  of 
attributes  may  be  used  to  identify  entities.  Basically,  an  entity  key  is  a  group  of 
attributes  such  that  the  mapping  from  the  entity  set  to  the  corresponding  group 
of  value  sets  is  one-to-one.  If  we  cannot  find  such  one-to-one  mapping  on  available 
data,  or  if  simplicity  in  identifying  entities  is  desired,  we  may  define  an  artificial 
attribute and a value set so that such mapping is possible. In the case where
several keys exist, we usually choose a semantically meaningful key as the entity
primary key (PK).

Fig. 4. Information about entities in an entity set (table form)

Fig. 6. Representing entities by values (employee numbers)

Figure 6 is obtained by merging the entity set EMPLOYEE with value set
EMPLOYEE-NO in Figure 2. We should notice some semantic implications of
Figure 6. Each value in the value set EMPLOYEE-NO represents an entity
(employee). Attributes map from the value set EMPLOYEE-NO to other value
sets. Also note that the attribute EMPLOYEE-NO maps from the value set
EMPLOYEE-NO to itself.
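The requirement that an entity key be a one-to-one mapping can be checked mechanically on available data. The following is a minimal sketch under stated assumptions: entities are represented as dictionaries of attribute values, the sample rows are invented, and `is_entity_key` is a hypothetical helper rather than anything defined in the paper.

```python
def is_entity_key(entity_rows, attributes):
    """True if the attribute group maps the given entities one-to-one."""
    seen = set()
    for row in entity_rows:
        key = tuple(row[a] for a in attributes)
        if key in seen:          # two entities share the same key values
            return False
        seen.add(key)
    return True

# Invented sample data for the entity set EMPLOYEE.
employees = [
    {"EMPLOYEE-NO": 101, "FIRST-NAME": "Peter", "LAST-NAME": "Jones", "AGE": 25},
    {"EMPLOYEE-NO": 102, "FIRST-NAME": "Mary", "LAST-NAME": "Jones", "AGE": 25},
    {"EMPLOYEE-NO": 103, "FIRST-NAME": "Peter", "LAST-NAME": "Smith", "AGE": 40},
]
```

On this sample, EMPLOYEE-NO alone is a key, AGE is not, and the group (FIRST-NAME, LAST-NAME) also happens to be a key, which illustrates that several keys may exist and one must be chosen as the primary key. A check on stored data can only refute a candidate key; whether the mapping is one-to-one in general is a semantic decision.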

2.3.2  Entity/Relationship  Relations.  Information  about  entities  in  an  entity 
set  can  now  be  organized  in  a  form  shown  in  Figure  7.  Note  that  Figure  7  is  similar 
to  Figure  4  except  that  entities  are  represented  by  the  values  of  their  primary 
keys.  The  whole  table  in  Figure  7  is  an  entity  relation,  and  each  row  is  an  entity 
tuple. 

Fig. 7. Regular entity relation EMPLOYEE

Since a relationship is identified by the involved entities, the primary key of a
relationship can be represented by the primary keys of the involved entities. In
Figure 8, the involved entities are represented by their primary keys EMPLOYEE-NO
and PROJECT-NO. The role names provide the semantic meaning for the
values in the corresponding columns. Note that EMPLOYEE-NO is the primary
key for the involved entities in the relationship and is not an attribute of the
relationship. PERCENTAGE-OF-TIME is an attribute of the relationship. The
table in Figure 8 is a relationship relation, and each row of values is a relationship
tuple.
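A relationship relation of this kind can be sketched as a mapping whose key is the tuple of primary keys of the involved entities and whose value holds the relationship attributes. The two rows below are the ones shown in Figure 8; the dictionary layout and the helper functions are assumptions made for illustration.

```python
# Key: (EMPLOYEE-NO in role WORKER, PROJECT-NO in role PROJECT).
# Value: the relationship attribute PERCENTAGE-OF-TIME.
PROJECT_WORKER = {
    (2566, 31): 20,
    (2173, 25): 100,
}

def projects_of(employee_no):
    """Project numbers of all relationship tuples involving the employee."""
    return sorted(p for (e, p) in PROJECT_WORKER if e == employee_no)

def percentage_of_time(employee_no, project_no):
    """The attribute depends on both entities, so both keys are required."""
    return PROJECT_WORKER[(employee_no, project_no)]
```

Because PERCENTAGE-OF-TIME is looked up by the pair of keys, it cannot be mistaken for an attribute of EMPLOYEE or of PROJECT alone.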

In  certain  cases,  the  entities  in  an  entity  set  cannot  be  uniquely  identified  by 
the  values  of  their  own  attributes;  thus  we  must  use  a  relationship(s)  to  identify 
them.  For  example,  consider  dependents  of  employees:  dependents  are  identified 
by  their  names  and  by  the  values  of  the  primary  key  of  the  employees  supporting 
them (i.e. by their relationships with the employees). Note that in Figure 9,
EMPLOYEE-NO is not an attribute of an entity in the set DEPENDENT but
is the primary key of the employees who support dependents. Each row of values
in Figure 9 is an entity tuple with EMPLOYEE-NO and NAME as its primary
key. The whole table is an entity relation.

                          |------- PRIMARY KEY -------|
ENTITY RELATION NAME      EMPLOYEE       PROJECT
ROLE                      WORKER         PROJECT
ATTRIBUTE                 EMPLOYEE-NO    PROJECT-NO    PERCENTAGE-OF-TIME
VALUE SET (DOMAIN)        EMPLOYEE-NO    PROJECT-NO    PERCENTAGE
RELATIONSHIP TUPLE        2566           31            20
                          2173           25            100
                          ...            ...           ...

(EMPLOYEE-NO and PROJECT-NO are entity attributes; PERCENTAGE-OF-TIME is a
relationship attribute.)

Fig. 8. Regular relationship relation PROJECT-WORKER

                          |------- PRIMARY KEY -------|
ENTITY RELATION NAME      EMPLOYEE
ROLE                      SUPPORTER
ATTRIBUTE                 EMPLOYEE-NO    NAME          AGE
VALUE SET (DOMAIN)        EMPLOYEE-NO    FIRST-NAME    NO-OF-YEARS
ENTITY TUPLE              2566           VICTOR        3
                          2173           GEORGE        6
                          ...            ...           ...

Fig. 9. A weak entity relation DEPENDENT

Theoretically, any kind of relationship may be used to identify entities. For
simplicity, we shall restrict ourselves to the use of only one kind of relationship:
the binary relationships with 1:n mapping in which the existence of the n entities
on one side of the relationship depends on the existence of one entity on the other
side of the relationship. For example, one employee may have n (= 0, 1, 2, ...)
dependents, and the existence of the dependents depends on the existence of the
corresponding employee.

This method of identification of entities by relationships with other entities can
be applied recursively until the entities which can be identified by their own
attribute values are reached. For example, the primary key of a department in a
company may consist of the department number and the primary key of the
division, which in turn consists of the division number and the name of the company.

Therefore, we have two forms of entity relations. If relationships are used for
identifying the entities, we shall call it a weak entity relation (Figure 9). If
relationships are not used for identifying the entities, we shall call it a regular entity relation
(Figure 7). Similarly, we also have two forms of relationship relations. If all
entities in the relationship are identified by their own attribute values, we shall
call it a regular relationship relation (Figure 8). If some entities in the relationship
are identified by other relationships, we shall call it a weak relationship relation.
For example, any relationships between DEPENDENT entities and other entities
will result in weak relationship relations, since a DEPENDENT entity is identified
by its name and its relationship with an EMPLOYEE entity. The distinction
between regular (entity/relationship) relations and weak (entity/relationship)
relations will be useful in maintaining data integrity.
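One way to see why the distinction matters for integrity is to sketch a weak entity relation whose primary key embeds the primary key of the supporting entity. The DEPENDENT rows below are those of Figure 9; the employee ages, the dictionary layout, and the functions `add_dependent` and `remove_employee` are hypothetical choices for this sketch, and removing dependents together with their employee is one possible way to honor the existence dependency.

```python
# Regular entity relation, keyed by EMPLOYEE-NO (ages are invented).
EMPLOYEE = {2566: {"AGE": 40}, 2173: {"AGE": 35}}

# Weak entity relation, keyed by (EMPLOYEE-NO of the supporter, NAME).
DEPENDENT = {(2566, "VICTOR"): {"AGE": 3}, (2173, "GEORGE"): {"AGE": 6}}

def add_dependent(employee_no, name, age):
    """A dependent can exist only if its supporting employee exists."""
    if employee_no not in EMPLOYEE:
        raise KeyError(f"no employee {employee_no} to support {name}")
    DEPENDENT[(employee_no, name)] = {"AGE": age}

def remove_employee(employee_no):
    """Remove an employee together with the dependents it identifies."""
    del EMPLOYEE[employee_no]
    for key in [k for k in DEPENDENT if k[0] == employee_no]:
        del DEPENDENT[key]
```

Two dependents named GEORGE can coexist as long as they are supported by different employees, since the name alone is not the key.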
[Figure 10: the entity set EMPLOYEE (rectangle, role WORKER) and the entity set
PROJECT (rectangle, role PROJECT) are joined by lines to the relationship set
PROJECT-WORKER (diamond).]

Fig. 10. A simple entity-relationship diagram

3.  ENTITY-RELATIONSHIP DIAGRAM AND INCLUSION OF SEMANTICS IN
DATA DESCRIPTION AND MANIPULATION

3.1    System  Analysis  Using  the  Entity-Relationship  Diagram 

In this section we introduce a diagrammatic technique for exhibiting entities and
relationships: the entity-relationship diagram.

Figure 10 illustrates the relationship set PROJECT-WORKER and the entity
sets EMPLOYEE and PROJECT using this diagrammatic technique. Each entity
set is represented by a rectangular box, and each relationship set is represented by
a diamond-shaped box. The fact that the relationship set PROJECT-WORKER
is defined on the entity sets EMPLOYEE and PROJECT is represented by the
lines connecting the rectangular boxes. The roles of the entities in the relationship
are stated.


[Figure 11: entity-relationship diagram showing the entity sets DEPARTMENT,
EMPLOYEE, DEPENDENT, SUPPLIER, PROJECT, and PART as rectangles, connected by
relationship sets drawn as diamonds, among them EMP-DEP, PROJ-WORKER (labeled
M and N), PROJ-MANAGER, and COMPONENT.]

Fig. 11. An entity-relationship diagram for analysis of information in a manufacturing firm

Figure 11 illustrates a more complete diagram of some entity sets and relationship
sets which might be of interest to a manufacturing company. DEPARTMENT,
EMPLOYEE, DEPENDENT, PROJECT, SUPPLIER, and PART are entity
sets. DEPARTMENT-EMPLOYEE, EMPLOYEE-DEPENDENT, PROJECT-WORKER,
PROJECT-MANAGER, SUPPLIER-PROJECT-PART, PROJECT-PART,
and COMPONENT are relationship sets. The COMPONENT
relationship describes what subparts (and quantities) are needed in making
superparts. The meaning of the other relationship sets need not be explained.

Several important characteristics about relationships in general can be found in
Figure 11:

(1) A relationship set may be defined on more than two entity sets. For example,
the SUPPLIER-PROJECT-PART relationship set is defined on three entity sets:
SUPPLIER, PROJECT, and PART.

(2) A relationship set may be defined on only one entity set. For example, the
relationship set COMPONENT is defined on one entity set, PART.

(3) There may be more than one relationship set defined on given entity sets.
For example, the relationship sets PROJECT-WORKER and PROJECT-MANAGER
are defined on the entity sets PROJECT and EMPLOYEE.

(4) The diagram can distinguish between 1:n, m:n, and 1:1 mappings. The
relationship set DEPARTMENT-EMPLOYEE is a 1:n mapping, that is, one
department may have n (n = 0, 1, 2, ...) employees and each employee works for
only one department. The relationship set PROJECT-WORKER is an m:n
mapping, that is, each project may have zero, one, or more employees assigned to
it and each employee may be assigned to zero, one, or more projects. It is also
possible to express 1:1 mappings such as the relationship set MARRIAGE.
Information about the number of entities in each entity set which is allowed in a
relationship set is indicated by specifying "1", "m", "n" in the diagram. The relational
model and the entity set model² do not include this type of information; the network
model cannot express a 1:1 mapping easily.

(5) The diagram can express the existence dependency of one entity type on
another. For example, the arrow in the relationship set EMPLOYEE-DEPENDENT
indicates that existence of an entity in the entity set DEPENDENT
depends on the corresponding entity in the entity set EMPLOYEE. That is, if an
employee leaves the company, his dependents may no longer be of interest.

Note  that  the  entity  set  DEPENDENT  is  shown  as  a  special  rectangular  box. 
This  indicates  that  at  level  2  the  information  about  entities  in  this  set  is  organized 
as  a  weak  entity  relation  (using  the  primary  key  of  EMPLOYEE  as  a  part  of  its 
primary  key). 
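The 1:n, m:n, and 1:1 mappings described in item (4) can be illustrated by classifying the pairs actually present in a binary relationship set. This is only a sketch: `mapping_type` is a hypothetical helper, the sample identifiers are invented, and it inspects one stored instance, whereas the "1", "m", and "n" labels in the diagram declare what is allowed for every instance.

```python
from collections import Counter

def mapping_type(pairs):
    """Classify a set of (left, right) pairs as '1:1', '1:n', 'n:1', or 'm:n'."""
    pairs = set(pairs)
    left = Counter(a for a, _ in pairs)    # partners per left entity
    right = Counter(b for _, b in pairs)   # partners per right entity
    left_many = any(c > 1 for c in left.values())
    right_many = any(c > 1 for c in right.values())
    if left_many and right_many:
        return "m:n"
    if left_many:
        return "1:n"
    if right_many:
        return "n:1"
    return "1:1"

# Invented instances of three relationship sets.
department_employee = [("D1", "E1"), ("D1", "E2"), ("D2", "E3")]
project_worker = [("P1", "E1"), ("P1", "E2"), ("P2", "E1")]
marriage = [("H1", "W1"), ("H2", "W2")]
```

An instance that happens to look 1:1 may still belong to a relationship set declared as 1:n, so a check of this kind can detect violations of a declared mapping but cannot infer the declaration.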

3.2  An  Example  of  a  Database  Design  and  Description 

There are four steps in designing a database using the entity-relationship model:
(1) identify the entity sets and the relationship sets of interest; (2) identify
semantic information in the relationship sets such as whether a certain relationship
set is a 1:n mapping; (3) define the value sets and attributes; (4) organize data
into entity/relationship relations and decide primary keys.

² This mapping information is included in DIAM II [24].

Let us use the manufacturing company discussed in Section 3.1 as an example.
The results of the first two steps of database design are expressed in an
entity-relationship diagram as shown in Figure 11. The third step is to define value sets
and attributes (see Figures 2 and 3). The fourth step is to decide the primary
keys for the entities and the relationships and to organize data as
entity/relationship relations. Note that each entity/relationship set in Figure 11 has a
corresponding entity/relationship relation. We shall use the names of the entity sets
(at level 1) as the names of the corresponding entity/relationship relations (at
level 2) as long as no confusion will result.

At the end of the section, we illustrate a schema (data definition) for a small
part of the database in the above manufacturing company example (the syntax
of the data definition is not important). Note that value sets are defined with
specifications of representations and allowable values. For example, values in
EMPLOYEE-NO are represented as 4-digit integers and range from 0 to 2000.
We then declare three entity relations: EMPLOYEE, PROJECT, and
DEPENDENT. The attributes and value sets defined on the entity sets as well as
the primary keys are stated. DEPENDENT is a weak entity relation since it uses
EMPLOYEE.PK as part of its primary key. We also declare two relationship
relations: PROJECT-WORKER and EMPLOYEE-DEPENDENT. The roles
and involved entities in the relationships are specified. We use EMPLOYEE.PK
to indicate the name of the entity relation (EMPLOYEE) and whatever
attribute-value-set pairs are used as the primary keys in that entity relation. The maximum
number of entities from an entity set in a relation is stated. For example,
PROJECT-WORKER is an m:n mapping. We may specify the values of m and n. We may
also specify the minimum number of entities in addition to the maximum number.
EMPLOYEE-DEPENDENT is a weak relationship relation since one of the
related entity relations, DEPENDENT, is a weak entity relation. Note that the
existence dependence of the dependents on the supporter is also stated.


DECLARE   VALUE-SETS       REPRESENTATION    ALLOWABLE-VALUES
          EMPLOYEE-NO      INTEGER (4)       (0,2000)
          FIRST-NAME       CHARACTER (8)     ALL
          LAST-NAME        CHARACTER (10)    ALL
          NO-OF-YEARS      INTEGER (3)       (0,100)
          PROJECT-NO       INTEGER (3)       (1,500)
          PERCENTAGE       FIXED (5.2)       (0,100.00)

DECLARE   REGULAR ENTITY RELATION EMPLOYEE
          ATTRIBUTE/VALUE-SET:
              EMPLOYEE-NO/EMPLOYEE-NO
              NAME/(FIRST-NAME,LAST-NAME)
              ALTERNATIVE-NAME/(FIRST-NAME,LAST-NAME)
              AGE/NO-OF-YEARS
          PRIMARY KEY:
              EMPLOYEE-NO

DECLARE   REGULAR ENTITY RELATION PROJECT
          ATTRIBUTE/VALUE-SET:
              PROJECT-NO/PROJECT-NO
          PRIMARY KEY:
              PROJECT-NO

DECLARE   REGULAR RELATIONSHIP RELATION PROJECT-WORKER
          ROLE/ENTITY-RELATION.PK/MAX-NO-OF-ENTITIES
              WORKER/EMPLOYEE.PK/m
              PROJECT/PROJECT.PK/n         (m:n mapping)
          ATTRIBUTE/VALUE-SET:
              PERCENTAGE-OF-TIME/PERCENTAGE

DECLARE   WEAK RELATIONSHIP RELATION EMPLOYEE-DEPENDENT
          ROLE/ENTITY-RELATION.PK/MAX-NO-OF-ENTITIES
              SUPPORTER/EMPLOYEE.PK/1
              DEPENDENT/DEPENDENT.PK/n
          EXISTENCE OF DEPENDENT DEPENDS ON
              EXISTENCE OF SUPPORTER

DECLARE   WEAK ENTITY RELATION DEPENDENT
          ATTRIBUTE/VALUE-SET:
              NAME/FIRST-NAME
              AGE/NO-OF-YEARS
          PRIMARY KEY:
              NAME
              EMPLOYEE.PK THROUGH EMPLOYEE-DEPENDENT
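The value-set part of such a schema might be mirrored in a program as follows. This is a sketch only: the `ValueSet` class and its `admits` method are hypothetical, the allowable ranges are assumed to be inclusive at both ends, and the REPRESENTATION column is carried as a plain string without being enforced.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ValueSet:
    name: str
    representation: str                       # kept as text, not enforced
    allowable: Optional[Tuple[float, float]]  # None stands for ALL

    def admits(self, value) -> bool:
        """Test predicate of the value set (bounds assumed inclusive)."""
        if self.allowable is None:
            return True
        low, high = self.allowable
        return low <= value <= high

VALUE_SETS = {vs.name: vs for vs in [
    ValueSet("EMPLOYEE-NO", "INTEGER (4)", (0, 2000)),
    ValueSet("FIRST-NAME", "CHARACTER (8)", None),
    ValueSet("LAST-NAME", "CHARACTER (10)", None),
    ValueSet("NO-OF-YEARS", "INTEGER (3)", (0, 100)),
    ValueSet("PROJECT-NO", "INTEGER (3)", (1, 500)),
    ValueSet("PERCENTAGE", "FIXED (5.2)", (0, 100.00)),
]}

# Attribute/value-set pairs of the regular entity relation EMPLOYEE.
EMPLOYEE_ATTRIBUTES = {
    "EMPLOYEE-NO": ("EMPLOYEE-NO",),
    "NAME": ("FIRST-NAME", "LAST-NAME"),
    "ALTERNATIVE-NAME": ("FIRST-NAME", "LAST-NAME"),
    "AGE": ("NO-OF-YEARS",),
}
```

Keeping value sets in one table and attributes in another preserves the point made in Section 2.2.3 that several attributes may share the same value sets.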


3.3   Implications  on  Data  Integrity 

Some  work  has  been  done  on  data  integrity  for  other  models  [8,  14,  16,  28].  With 
explicit  concept.s  of  entity  arul  relationship,  the  entity-relationship  model  will  be 
useful  in  understanding  and  specifying  constraints  for  maintaining  data  integrity. 
For  example,  there  are  three  major  kinds  of  constraints  on  values: 

(1)  Constraints  on  allowable,  values  for  a  value  set.  This  point  was  diseu.ssed  in 
defining  the  schema  in  Section  3.2. 

(2)  Constraints  on  perjuitled  values  for  a  certain  attribute.  In  some  cases,  not 
all  allowable  values  in  a  value  set  are  permitted  for  some  attributes.  For  example, 
we  may  have  a  restriction  of  ages  of  employees  to  between  20  and  65.  That  is, 

AGE(el  e  (20,05),  where  e  €  EMPLOYEE. 

Note  that  we  use  the  level  1  notations  to  clarify  the  semantics.  Since  each  entity/ 
relationship  set  has  a  corresponding  entity/relationship  relaticjn,  the  above  expres- 
sion can  be  easily  translated  into  level  2  notations. 

(3)  Constraints  on  existing  values  in  the  database.  There  are  two  types  of 
constraints: 

(i) Constraints between sets of existing values. For example,

{NAME(e) | e ∈ MALE-PERSON} ⊆ {NAME(e) | e ∈ PERSON}.

(ii) Constraints between particular values. For example,

TAX(e) < SALARY(e), e ∈ EMPLOYEE

or

BUDGET(e_i) = Σ BUDGET(e_j), where e_i ∈ COMPANY,
e_j ∈ DEPARTMENT,
and [e_i, e_j] ∈ COMPANY-DEPARTMENT.
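
The three kinds of constraints above can be checked mechanically. The following is a minimal sketch in Python, not part of the model itself: the data values, the bounds of the value set NO-OF-YEARS, the inclusive reading of "between 20 and 65", and all function names are assumptions made only for illustration.

```python
# Illustrative data only; every name and number below is a hypothetical example.
NO_OF_YEARS = range(0, 200)  # value set: allowable values (assumed bounds)

EMPLOYEE = [
    {"NAME": "PETER", "AGE": 25, "SALARY": 1000, "TAX": 200},
    {"NAME": "MARY", "AGE": 23, "SALARY": 1200, "TAX": 250},
]
PERSON = [{"NAME": "PETER"}, {"NAME": "MARY"}]
MALE_PERSON = [{"NAME": "PETER"}]
COMPANY = {"c1": {"BUDGET": 300}}
DEPARTMENT = {"d1": {"BUDGET": 100}, "d2": {"BUDGET": 200}}
COMPANY_DEPARTMENT = {("c1", "d1"), ("c1", "d2")}


def allowable(e):
    """(1) The value belongs to the value set of the attribute."""
    return e["AGE"] in NO_OF_YEARS


def permitted(e):
    """(2) AGE of an EMPLOYEE lies between 20 and 65 (inclusive bounds assumed)."""
    return 20 <= e["AGE"] <= 65


def names_subset():
    """(3)(i) Names of male persons form a subset of names of persons."""
    return {p["NAME"] for p in MALE_PERSON} <= {p["NAME"] for p in PERSON}


def tax_below_salary(e):
    """(3)(ii) TAX(e) < SALARY(e)."""
    return e["TAX"] < e["SALARY"]


def budget_balanced(c):
    """(3)(ii) A company's budget equals the sum of its departments' budgets."""
    total = sum(DEPARTMENT[d]["BUDGET"]
                for (ci, d) in COMPANY_DEPARTMENT if ci == c)
    return COMPANY[c]["BUDGET"] == total


# Names of employees that violate any per-entity constraint.
violations = [e["NAME"] for e in EMPLOYEE
              if not (allowable(e) and permitted(e) and tax_below_salary(e))]
```

Each check is stated at level 1, in terms of entities and attributes, which mirrors the way the constraints are written in the text.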

3.4   Semantics  and  Set  Operations  of  Information  Retrieval  Requests 

The semantics of information retrieval requests become very clear if the requests
are based on the entity-relationship model of data. For clarity, we first discuss
the situation at level 1. Conceptually, the information elements are organized as
in Figures 4 and 5 (or Figures 2 and 3). Many information retrieval requests can
be considered as a combination of the following basic types of operations:

(1) Selection of a subset of values from a value set.

(2) Selection of a subset of entities from an entity set (i.e. selection of certain
rows in Figure 4). Entities are selected by stating the values of certain attributes
(i.e. subsets of value sets) and/or their relationships with other entities.

(3) Selection of a subset of relationships from a relationship set (i.e. selection
of certain rows in Figure 5). Relationships are selected by stating the values of
certain attribute(s) and/or by identifying certain entities in the relationship.

(4) Selection of a subset of attributes (i.e. selection of columns in Figures 4
and 5).

An information retrieval request like "What are the ages of the employees whose
weights are greater than 170 and who are assigned to the project with
PROJECT-NO 254?" can be expressed as:

{AGE(e) | e ∈ EMPLOYEE, WEIGHT(e) > 170,
[e, e_j] ∈ PROJECT-WORKER, e_j ∈ PROJECT,
PROJECT-NO(e_j) = 254};

or

{AGE(EMPLOYEE) | WEIGHT(EMPLOYEE) > 170,
[EMPLOYEE, PROJECT] ∈ PROJECT-WORKER,
PROJECT-NO(PROJECT) = 254}.
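
The level 1 expression above reads almost directly as a set comprehension. The sketch below is only an illustration in Python: the entity identifiers (e1, p1, ...) and all attribute values are hypothetical, and entities are modeled as dictionary keys standing in for the e's of the notation.

```python
# Entities are identified by opaque keys; attributes are functions on entities,
# modeled here as dictionary lookups. All values are hypothetical.
EMPLOYEE = {
    "e1": {"AGE": 25, "WEIGHT": 180},
    "e2": {"AGE": 23, "WEIGHT": 150},
    "e3": {"AGE": 40, "WEIGHT": 200},
}
PROJECT = {"p1": {"PROJECT-NO": 254}, "p2": {"PROJECT-NO": 7}}

# The relationship set PROJECT-WORKER as a set of [employee, project] pairs.
PROJECT_WORKER = {("e1", "p1"), ("e2", "p1"), ("e3", "p2")}

# {AGE(e) | e in EMPLOYEE, WEIGHT(e) > 170,
#           [e, p] in PROJECT-WORKER, p in PROJECT, PROJECT-NO(p) = 254}
ages = {
    EMPLOYEE[e]["AGE"]
    for (e, p) in PROJECT_WORKER
    if EMPLOYEE[e]["WEIGHT"] > 170 and PROJECT[p]["PROJECT-NO"] == 254
}
```

With this sample data only e1 satisfies both conditions, so `ages` contains the single value 25.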

To retrieve information as organized in Figure 6 at level 2, "entities" and
"relationships" in (2) and (3) should be replaced by "entity PK" and "relationship
PK." The above information retrieval request can be expressed as:

{AGE(EMPLOYEE.PK) | WEIGHT(EMPLOYEE.PK) > 170,
(WORKER/EMPLOYEE.PK, PROJECT/PROJECT.PK) ∈ {PROJECT-WORKER.PK},
PROJECT-NO(PROJECT.PK) = 254}.

Table I. Insertion

level 1                                       level 2

operation:                                    operation:
  insert an entity to an entity set             create an entity tuple with a certain
                                                entity-PK
                                              check:
                                                whether PK already exists or is acceptable

operation:                                    operation:
  insert a relationship in a relationship       create a relationship tuple with given
  set                                           entity PKs
check:                                        check:
  whether the entities exist                    whether the entity PKs exist

operation:                                    operation:
  insert properties of an entity or a           insert values in an entity tuple or a
  relationship                                  relationship tuple
check:                                        check:
  whether the value is acceptable               whether the values are acceptable

To retrieve information as organized in entity/relationship relations (Figures 7,
8, and 9), we can express it in a SEQUEL-like language [6]:

SELECT  AGE
FROM    EMPLOYEE
WHERE   WEIGHT > 170
AND     EMPLOYEE.PK =
        SELECT  WORKER/EMPLOYEE.PK
        FROM    PROJECT-WORKER
        WHERE   PROJECT-NO = 254.
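
For readers who wish to run a request of this shape, the following sketch restates it against an in-memory SQLite database through Python's standard sqlite3 module. It is an approximation under stated assumptions, not the SEQUEL-like language of [6]: the table layout, the underscore column names, the sample rows, and the use of IN for the nested selection are all choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Entity relation EMPLOYEE and relationship relation PROJECT-WORKER
-- (hypothetical layout and sample rows).
CREATE TABLE EMPLOYEE (EMPLOYEE_NO INTEGER PRIMARY KEY, AGE INTEGER, WEIGHT INTEGER);
CREATE TABLE PROJECT_WORKER (EMPLOYEE_NO INTEGER, PROJECT_NO INTEGER);
INSERT INTO EMPLOYEE VALUES (2566, 25, 180), (3378, 23, 150), (4000, 40, 200);
INSERT INTO PROJECT_WORKER VALUES (2566, 254), (3378, 254), (4000, 7);
""")

# Ages of employees heavier than 170 who work on project 254.
rows = conn.execute("""
    SELECT AGE
    FROM EMPLOYEE
    WHERE WEIGHT > 170
      AND EMPLOYEE_NO IN (SELECT EMPLOYEE_NO
                          FROM PROJECT_WORKER
                          WHERE PROJECT_NO = 254)
""").fetchall()
conn.close()
```

The nested selection plays the role of the relationship PROJECT-WORKER: it yields the PKs of the workers on the project, and the outer selection filters the entity relation by those PKs.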


Table II. Updating

level 1                                       level 2

operation:                                    operation:
• change the value of an entity attribute     • update a value
                                              consequence:
                                              • if it is not part of an entity PK, no
                                                consequence
                                              • if it is part of an entity PK,
                                                •• change the entity PKs in all related
                                                   relationship relations
                                                •• change PKs of other entities which use
                                                   this value as part of their PKs (for
                                                   example, DEPENDENTS' PKs use
                                                   EMPLOYEE'S PK)

operation:                                    operation:
• change the value of a relationship          • update a value (note that a relationship
  attribute                                     attribute will not be a relationship PK)

Table III. Deletion

level 1                                       level 2

operation:                                    operation:
• delete an entity                            • delete an entity tuple
consequences:                                 consequences (applied recursively):
• delete any entity whose existence depends   • delete any entity tuple whose existence
  on this entity                                depends on this entity tuple
• delete relationships involving this entity  • delete relationship tuples associated with
• delete all related properties                 this entity

operation:                                    operation:
• delete a relationship                       • delete a relationship tuple
consequences:
• delete all related properties

It is possible to retrieve information about entities in two different entity sets
without specifying a relationship between them. For example, an information
retrieval request like "List the names of employees and ships which have the same
age" can be expressed in the level 1 notation as:

{(NAME(e_i), NAME(e_j)) | e_i ∈ EMPLOYEE, e_j ∈ SHIP, AGE(e_i) = AGE(e_j)}.
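
A request of this kind needs no relationship set at all, only a comparison of attribute values across two entity sets. The short Python sketch below illustrates this; the sample rows are hypothetical and the dictionary representation of entities is an assumption of the sketch.

```python
# Two unrelated entity sets, each with attributes NAME and AGE (sample data).
EMPLOYEE = [{"NAME": "PETER", "AGE": 25}, {"NAME": "MARY", "AGE": 23}]
SHIP = [{"NAME": "MISSOURI", "AGE": 25}, {"NAME": "VIRGINIA", "AGE": 10}]

# {(NAME(e_i), NAME(e_j)) | e_i in EMPLOYEE, e_j in SHIP, AGE(e_i) = AGE(e_j)}
same_age = {
    (e["NAME"], s["NAME"])
    for e in EMPLOYEE
    for s in SHIP
    if e["AGE"] == s["AGE"]
}
```

Because the request names AGE of EMPLOYEE and AGE of SHIP explicitly, the reader can see exactly which two attributes are being compared.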

We  do  not  further  discuss  the  language  syntax  here.  What  we  wish  to  stress  is 
that  information  requests  may  be  expressed  using  set  notions  and  set  operations 
[17],  and  the  request  semantics  are  very  clear  in  adopting  this  point  of  view. 

3.5  Semantics  and  Rules  for  Insertion,  Deletion,  and  Updating 

It is always a difficult problem to maintain data consistency following insertion,
deletion, and updating of data in the database. One of the major reasons is that
the semantics and consequences of insertion, deletion, and updating operations
usually are not clearly defined; thus it is difficult to find a set of rules which can
enforce data consistency. We shall see that this data consistency problem becomes
simpler using the entity-relationship model.

In Tables I-III, we discuss the semantics and rules¹ for insertion, deletion, and
updating at both level 1 and level 2. Level 1 is used to clarify the semantics.
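
Some of the level 2 rules in Tables I and III can be sketched as a small in-memory store. This is a hedged illustration in Python rather than a complete rule set: the class `Database`, its method names, and the `depends_on` bookkeeping for existence dependency are all hypothetical names introduced here.

```python
class Database:
    """Toy store illustrating insertion checks and recursive deletion."""

    def __init__(self):
        self.entities = {}       # (relation name, PK) -> attribute values
        self.relationships = {}  # (relationship name, entity keys) -> values
        self.depends_on = {}     # dependent entity key -> supporting entity key

    def insert_entity(self, relation, pk, values=None, depends_on=None):
        key = (relation, pk)
        if key in self.entities:  # check: whether PK already exists
            raise ValueError(f"PK already exists: {key}")
        if depends_on is not None and depends_on not in self.entities:
            raise ValueError(f"supporting entity does not exist: {depends_on}")
        self.entities[key] = dict(values or {})
        if depends_on is not None:
            self.depends_on[key] = depends_on

    def insert_relationship(self, name, entity_keys, values=None):
        missing = [k for k in entity_keys if k not in self.entities]
        if missing:  # check: whether the entity PKs exist
            raise ValueError(f"entity PKs do not exist: {missing}")
        self.relationships[(name, tuple(entity_keys))] = dict(values or {})

    def delete_entity(self, key):
        # Consequence 1 (applied recursively): delete dependent entity tuples.
        for dependent in [d for d, s in self.depends_on.items() if s == key]:
            self.delete_entity(dependent)
        # Consequence 2: delete relationship tuples associated with this entity.
        for rkey in [r for r in self.relationships if key in r[1]]:
            del self.relationships[rkey]
        self.depends_on.pop(key, None)
        del self.entities[key]


db = Database()
emp = ("EMPLOYEE", 2566)
dep = ("DEPENDENT", ("SAM", 2566))  # weak entity: its PK includes EMPLOYEE's PK
db.insert_entity("EMPLOYEE", 2566, {"AGE": 25})
db.insert_entity("DEPENDENT", ("SAM", 2566), {"AGE": 3}, depends_on=emp)
db.insert_relationship("EMPLOYEE-DEPENDENT", [emp, dep])
db.delete_entity(emp)  # also removes the dependent and the relationship tuple
```

In this sketch the consequences of a deletion are carried out by the store rather than by the caller, in the spirit of the remark that such consequences can be performed by the system instead of by the users.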

4.  ANALYSIS  OF  OTHER  DATA  MODELS  AND  THEIR  DERIVATION  FROM  THE
ENTITY-RELATIONSHIP  MODEL

4.1    The  Relational  Model 

4.1.1 The Relational View of Data and Ambiguity in Semantics. In the
relational model, relation, R, is a mathematical relation defined on sets X_1, X_2,
..., X_n:

R = {(x_1, x_2, ..., x_n) | x_1 ∈ X_1, x_2 ∈ X_2, ..., x_n ∈ X_n}.

The sets X_1, X_2, ..., X_n are called domains, and (x_1, x_2, ..., x_n) is called a
tuple. Figure 12 illustrates a relation called EMPLOYEE. The domains in the relation
are EMPLOYEE-NO, FIRST-NAME, LAST-NAME, FIRST-NAME, LAST-NAME,
NO-OF-YEARS. The ordering of rows and columns in the relation has
no significance. To avoid ambiguity of columns with the same domain in a relation,
domain names are qualified by roles (to distinguish the role of the domain in the
relation). For example, in relation EMPLOYEE, domains FIRST-NAME and
LAST-NAME may be qualified by roles LEGAL or ALTERNATIVE. An attribute
name in the relational model is a domain name concatenated with a role name [10].
Comparing Figure 12 with Figure 7, we can see that "domains" are basically
equivalent to value sets. Although "role" or "attribute" in the relational model seems
to serve the same purpose as "attribute" in the entity-relationship model, the
semantics of these terms are different. The "role" or "attribute" in the relational
model is mainly used to distinguish domains with the same name in the same
relation, while "attribute" in the entity-relationship model is a function which
maps from an entity (or relationship) set into value set(s).

ROLE                   LEGAL       LEGAL      ALTERNATIVE  ALTERNATIVE
DOMAIN    EMPLOYEE-NO  FIRST-NAME  LAST-NAME  FIRST-NAME   LAST-NAME    NO-OF-YEARS
TUPLE     2566         PETER       JONES      SAM          JONES        25
          3378         MARY        CHEN       BARB         CHEN         23

Fig. 12. Relation EMPLOYEE

¹ Our main purpose is to illustrate the semantics of data manipulation operations.
Therefore, these rules may not be complete. Note that the consequence of operations
stated in the tables can be performed by the system instead of by the users.

Using relational operators in the relational model may cause semantic
ambiguities. For example, the join of the relation EMPLOYEE with the relation
EMPLOYEE-PROJECT (Figure 13) on domain EMPLOYEE-NO produces the
relation EMPLOYEE-PROJECT' (Figure 14). But what is the meaning of a
join between the relation EMPLOYEE with the relation SHIP on the domain
NO-OF-YEARS (Figure 15)? The problem is that the same domain name may
have different semantics in different relations (note that a role is intended to
distinguish domains in a given relation, not in all relations). If the domain
NO-OF-YEARS of the relation EMPLOYEE is not allowed to be compared with the domain
NO-OF-YEARS of the relation SHIP, different domain names have to be declared.
But if such a comparison is acceptable, can the database system warn the user?

PROJECT-NO  EMPLOYEE-NO
7           2566
3           2566
7           3378

Fig. 13. Relation EMPLOYEE-PROJECT

                         LEGAL       LEGAL      ALTERNATIVE  ALTERNATIVE
PROJECT-NO  EMPLOYEE-NO  FIRST-NAME  LAST-NAME  FIRST-NAME   LAST-NAME    NO-OF-YEARS
7           2566         PETER       JONES      SAM          JONES        25
3           2566         PETER       JONES      SAM          JONES        25
7           3378         MARY        CHEN       BARB         CHEN         23

Fig. 14. Relation EMPLOYEE-PROJECT' as a "join" of relations EMPLOYEE and
EMPLOYEE-PROJECT

In the entity-relationship model, the semantics of data are much more apparent.
For example, one column in the example stated above contains the values of AGE
of EMPLOYEE and the other column contains the values of AGE of SHIP. If
this semantic information is exposed to the user, he may operate more cautiously
(refer to the sample information retrieval requests stated in Section 3.4). Since
the database system contains the semantic information, it should be able to warn
the user of the potential problems for a proposed "join-like" operation.
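
One way such a warning could work is sketched below in Python. This is purely a hypothetical policy, not a mechanism described in this paper: the `SCHEMA` dictionary, the function `check_comparison`, and the three-way outcome (ok, warning, error) are assumptions made for illustration.

```python
# For each entity relation: attribute name -> value set name (sample schema).
SCHEMA = {
    "EMPLOYEE": {"EMPLOYEE-NO": "EMPLOYEE-NO", "AGE": "NO-OF-YEARS"},
    "SHIP": {"SHIP-NO": "SHIP-NO", "AGE": "NO-OF-YEARS"},
}


def check_comparison(rel_a, attr_a, rel_b, attr_b):
    """Classify a proposed comparison of two attributes in a join-like request."""
    value_set_a = SCHEMA[rel_a][attr_a]
    value_set_b = SCHEMA[rel_b][attr_b]
    if value_set_a != value_set_b:
        # Values drawn from different value sets are not comparable here.
        return f"error: {value_set_a} and {value_set_b} are different value sets"
    if rel_a != rel_b:
        # Same value set, but the attributes describe different kinds of entities.
        return (f"warning: comparing {attr_a} of {rel_a} "
                f"with {attr_b} of {rel_b}")
    return "ok"


message = check_comparison("EMPLOYEE", "AGE", "SHIP", "AGE")
```

The point of the sketch is only that the schema already records which attribute of which entity set a column represents, so the system has enough information to raise the question before the operation is carried out.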

SHIP-NO  NAME      NO-OF-YEARS
037      MISSOURI  25
056      VIRGINIA  10

Fig. 15. Relation SHIP

4.1.2 Semantics of Functional Dependencies Among Data. In the relational
model, "attribute" B of a relation is functionally dependent on "attribute" A of the
same relation if each value of A has no more than one value of B associated with
it in the relation. Semantics of functional dependencies among data become clear
in the entity-relationship model. Basically, there are two major types of functional
dependencies:

(1) Functional dependencies related to description of entities or relationships.
Since an attribute is defined as a function, it maps an entity in an entity set to a
single value in a value set (see Figure 2). At level 2, the values of the primary key
are used to represent entities. Therefore, nonkey value sets (domains) are
functionally dependent on primary-key value sets (for example, in Figures 6 and 7,
NO-OF-YEARS is functionally dependent on EMPLOYEE-NO). Since a relation
may have several keys, the nonkey value sets will functionally depend on any key
value set. The key value sets will be mutually functionally dependent on each
other. Similarly, in a relationship relation the nonkey value sets will be functionally
dependent on the prime-key value sets (for example, in Figure 8, PERCENTAGE
is functionally dependent on EMPLOYEE-NO and PROJECT-NO).

(2) Functional dependencies related to entities in a relationship. Note that
in Figure 11 we identify the types of mappings (1:n, m:n, etc.) for relationship
sets. For example, PROJECT-MANAGER is a 1:n mapping. Let us assume that
PROJECT-NO is the primary key in the entity relation PROJECT. In the
relationship relation PROJECT-MANAGER, the value set EMPLOYEE-NO will
be functionally dependent on the value set PROJECT-NO (i.e. each project has
only one manager).

The distinction between level 1 (Figure 2) and level 2 (Figures 6 and 7) and
the separation of entity relation (Figure 7) from relationship relation (Figure 8)
clarifies the semantics of functional dependencies among data.
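
The definition of functional dependency used above can be tested on the tuples of a relation. The following Python sketch does so for a small PROJECT-MANAGER relation; the function name `functionally_dependent` and the sample tuples are assumptions introduced for this illustration, and a test on sample data can only refute a dependency, never establish it as a rule of the schema.

```python
def functionally_dependent(rows, determinant, dependent):
    """True if each value of `determinant` has at most one value of `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Relationship relation PROJECT-MANAGER (hypothetical tuples): employee 2566
# manages projects 7 and 3, and each project has exactly one manager.
PROJECT_MANAGER = [
    {"PROJECT-NO": 7, "EMPLOYEE-NO": 2566},
    {"PROJECT-NO": 3, "EMPLOYEE-NO": 2566},
    {"PROJECT-NO": 5, "EMPLOYEE-NO": 3378},
]

project_determines_manager = functionally_dependent(
    PROJECT_MANAGER, ["PROJECT-NO"], ["EMPLOYEE-NO"])
manager_determines_project = functionally_dependent(
    PROJECT_MANAGER, ["EMPLOYEE-NO"], ["PROJECT-NO"])
```

With a 1:n mapping the dependency holds in one direction only: here PROJECT-NO determines EMPLOYEE-NO, while EMPLOYEE-NO does not determine PROJECT-NO.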

4.1.3 3NF Relations Versus Entity-Relationship Relations. From the definition
of "relation," any grouping of domains can be considered to be a relation. To avoid
undesirable properties in maintaining relations, a normalization process is proposed
to transform arbitrary relations into the first normal form, then into the second
normal form, and finally into the third normal form (3NF) [9, 11]. We shall
show that the entity and relationship relations in the entity-relationship model
are similar to 3NF relations but with clearer semantics and without using the
transformation operation.

Let us use a simplified version of an example of normalization described in [9].
The following three relations are in first normal form (that is, there is no domain
whose elements are themselves relations):

EMPLOYEE (EMPLOYEE-NO)
PART (PART-NO, PART-DESCRIPTION, QUANTITY-ON-HAND)
PART-PROJECT (PART-NO, PROJECT-NO, PROJECT-DESCRIPTION,
              PROJECT-MANAGER-NO, QUANTITY-COMMITTED).

Note that the domain PROJECT-MANAGER-NO actually contains the
EMPLOYEE-NO of the project manager. In the relations above, primary keys
are underlined.

Certain rules are applied to transform the relations above into third normal
form:

EMPLOYEE' (EMPLOYEE-NO)
PART' (PART-NO, PART-DESCRIPTION, QUANTITY-ON-HAND)
PROJECT' (PROJECT-NO, PROJECT-DESCRIPTION, PROJECT-MANAGER-NO)
PART-PROJECT' (PART-NO, PROJECT-NO, QUANTITY-COMMITTED).

Using the entity-relationship diagram in Figure 11, the following entity and
relationship relations can be easily derived:

entity relations:
    PART'' (PART-NO, PART-DESCRIPTION, QUANTITY-ON-HAND)
    PROJECT'' (PROJECT-NO, PROJECT-DESCRIPTION)
    EMPLOYEE'' (EMPLOYEE-NO)

relationship relations:
    PART-PROJECT'' (PART/PART-NO, PROJECT/PROJECT-NO,
                    QUANTITY-COMMITTED)
    PROJECT-MANAGER'' (PROJECT/PROJECT-NO,
                       MANAGER/EMPLOYEE-NO).

The role names of the entities in relationships (such as MANAGER) are indicated.
The entity relation names associated with the PKs of entities in relationships and
the value set names have been omitted.

Note that in the example above, entity/relationship relations are similar to the
3NF relations. In the 3NF approach, PROJECT-MANAGER-NO is included in
the relation PROJECT' since PROJECT-MANAGER-NO is assumed to be
functionally dependent on PROJECT-NO. In the entity-relationship model,
PROJECT-MANAGER-NO (i.e. EMPLOYEE-NO of a project manager) is
included in a relationship relation PROJECT-MANAGER since EMPLOYEE-NO
is considered as an entity PK in this case.

Also note that in the 3NF approach, changes in functional dependencies of data
may cause some relations not to be in 3NF. For example, if we make a new
assumption that one project may have more than one manager, the relation
PROJECT' is no longer a 3NF relation and has to be split into two relations as
PROJECT'' and PROJECT-MANAGER''. Using the entity-relationship model,
no such change is necessary. Therefore, we may say that by using the
entity-relationship model we can arrange data in a form similar to 3NF relations
but with clear semantic meaning.

It is interesting to note that the decomposition (or transformation) approach
described above for normalization of relations may be viewed as a bottom-up
approach in database design.² It starts with arbitrary relations (level 3 in Figure 1)
and then uses some semantic information (functional dependencies of data) to
transform them into 3NF relations (level 2 in Figure 1). The entity-relationship
model adopts a top-down approach, utilizing the semantic information to organize
data in entity/relationship relations.

4.2  The  Network  Model 

4.2.1 Semantics of the Data-Structure Diagram. One of the best ways to explain
the network model is by use of the data-structure diagram [3]. Figure 16(a)
illustrates a data-structure diagram. Each rectangular box represents a record type.


² Although the decomposition approach was emphasized in the relational model
literature, it is a procedure to obtain 3NF and may not be an intrinsic property
of 3NF.

ACM Transactions on Database Systems, Vol. 1, No. 1, March 1976.


Fig. 16. (a) data-structure diagram; (b) entity-relationship diagram

Fig. 17. Relationship PROJECT-WORKER: (a) data-structure diagram;
(b) entity-relationship diagram


The arrow represents a data-structure-set in which the DEPARTMENT record
is the owner-record, and one owner-record may own n (n = 0, 1, 2, ...)
member-records. Figure 16(b) illustrates the corresponding entity-relationship diagram.
One might conclude that the arrow in the data-structure diagram represents a
relationship between entities in two entity sets. This is not always true. Figures
17(a) and 17(b) are the data-structure diagram and the entity-relationship diagram
expressing the relationship PROJECT-WORKER between two entity types
EMPLOYEE and PROJECT. We can see in Figure 17(a) that the relationship
PROJECT-WORKER becomes another record type and that the arrows no
longer represent relationships between entities. What are the real meanings of the
arrows in data-structure diagrams? The answer is that an arrow represents a 1:n
relationship between two record (not entity) types and also implies the existence
of an access path from the owner record to the member records. The data-structure
diagram is a representation of the organization of records (level 4 in Figure 1)
and is not an exact representation of entities and relationships.

4.2.2 Deriving the Data-Structure Diagram. Under what conditions does an
arrow in a data-structure diagram correspond to a relationship of entities? A close
comparison of the data-structure diagrams with the corresponding
entity-relationship diagrams reveals the following rules:

1. For 1:n binary relationships an arrow is used to represent the relationship
(see Figure 16(a)).

2. For m:n binary relationships a "relationship record" type is created to
represent the relationship and arrows are drawn from the "entity record" type to the
"relationship record" type (see Figure 17(a)).

3. For k-ary (k ≥ 3) relationships, the same rule as (2) applies (i.e. creating a
"relationship record" type).

Fig. 18. Data-structure-set defined on the same record type (record type: PERSON)

Fig. 19. Relationship MARRIAGE: (a) data-structure diagram; (b) entity-relationship
diagram (labels: PERSON, HUSBAND, MARRIAGE, WIFE)

Since DBTG [7] does not allow a data-structure-set to be defined on a single
record type (i.e. Figure 18 is not allowed although it has been implemented in
[13]), a "relationship record" is needed to implement such relationships (see
Figure 19(a)) [20]. The corresponding entity-relationship diagram is shown in
Figure 19(b).
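
The three rules above, together with the restriction on relationships defined on a single record type, can be sketched as a small translation procedure. The Python below is an illustration under stated assumptions: the function name, the input format, the convention that the "1" side of a 1:n relationship is listed first, and the "1:1" label given to MARRIAGE are all hypothetical choices of this sketch.

```python
def derive_data_structure_diagram(entity_sets, relationships):
    """Return (record_types, arrows); each arrow is (owner record, member record).

    `relationships` maps a relationship name to (participants, mapping), where
    participants are entity-set names and, for "1:n", the "1" side comes first
    (an ordering convention assumed for this sketch).
    """
    records = list(entity_sets)
    arrows = []
    for name, (participants, mapping) in relationships.items():
        binary = len(participants) == 2
        distinct = len(set(participants)) == len(participants)
        if binary and distinct and mapping == "1:n":
            # Rule 1: the arrow itself represents the relationship.
            arrows.append((participants[0], participants[1]))
        else:
            # Rules 2 and 3, and relationships on a single record type:
            # create a "relationship record" owned by each participant.
            records.append(name)
            arrows.extend((entity, name) for entity in participants)
    return records, arrows


records, arrows = derive_data_structure_diagram(
    ["DEPARTMENT", "EMPLOYEE", "PROJECT", "PERSON"],
    {
        "DEPARTMENT-EMPLOYEE": (("DEPARTMENT", "EMPLOYEE"), "1:n"),
        "PROJECT-WORKER": (("EMPLOYEE", "PROJECT"), "m:n"),
        "MARRIAGE": (("PERSON", "PERSON"), "1:1"),
    },
)
```

In the result, DEPARTMENT-EMPLOYEE appears only as an arrow, while PROJECT-WORKER and MARRIAGE each become a record type with one arrow per participating role.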

It is clear now that arrows in a data-structure diagram do not always represent
relationships of entities. Even in the case that an arrow represents a 1:n
relationship, the arrow only represents a unidirectional relationship [20] (although it is
possible to find the owner-record from a member-record). In the entity-relationship
model, both directions of the relationship are represented (the roles of both
entities are specified). Besides the semantic ambiguity in its arrows, the network
model is awkward in handling changes in semantics. For example, if the relationship
between DEPARTMENT and EMPLOYEE changes from a 1:n mapping to an
m:n mapping (i.e. one employee may belong to several departments), we must
create a relationship record DEPARTMENT-EMPLOYEE in the network model.


Fig. 20. The data-structure diagram derived from the entity-relationship diagram in
Fig. 11 (record types: DEPARTMENT, EMPLOYEE, PROJECT, DEPENDENT,
PROJECT-WORKER, SUPPLIER, PART, COMPONENT)

Fig. 21. The "disciplined" data-structure diagram derived from the entity-relationship
diagram in Fig. 11

In  the  entity-relationship  model,  all  kinds  of  mappings  in  relationships  are  handled 
uniformly. 

The entity-relationship model can be used as a tool in the structured design of
databases using the network model. The user first draws an entity-relationship
diagram (Figure 11). He may simply translate it into a data-structure diagram
(Figure 20), using the rules specified above. He may also follow a discipline that
every entity or relationship must be mapped onto a record (that is, "relationship
records" are created for all types of relationships no matter that they are 1:n or
m:n mappings). Thus, in Figure 11, all one needs to do is to change the diamonds
to boxes and to add arrowheads on the appropriate lines. Using this approach,
three more boxes (DEPARTMENT-EMPLOYEE, EMPLOYEE-DEPENDENT,
and PROJECT-MANAGER) will be added to Figure 20 (see Figure 21).
The validity constraints discussed in Sections 3.3-3.5 will also be useful.

4.3  The  Entity  Set  Model 

4.3.1 The Entity Set View. The basic element of the entity set model is the
entity. Entities have names (entity names) such as "Peter Jones", "blue", or
"22". Entity names having some properties in common are collected into an
entity-name-set, which is referenced by the entity-name-set-name such as "NAME",
"COLOR", and "QUANTITY".

An entity is represented by the entity-name-set-name/entity-name pair such as
NAME/Peter Jones, EMPLOYEE-NO/2566, and NO-OF-YEARS/20. An entity
is described by its association with other entities. Figure 22 illustrates the entity
set view of data. The "DEPARTMENT" of entity EMPLOYEE-NO/2566 is the
entity DEPARTMENT-NO/405. In other words, "DEPARTMENT" is the role
that the entity DEPARTMENT-NO/405 plays to describe the entity
EMPLOYEE-NO/2566. Similarly, the "NAME", "ALTERNATIVE-NAME", or
"AGE" of EMPLOYEE-NO/2566 is "NAME/Peter Jones", "NAME/Sam Jones",
or "NO-OF-YEARS/20", respectively. The description of the entity
EMPLOYEE-NO/2566 is a collection of the related entities and their roles (the
entities and roles circled by the dotted line). An example of the entity description of
"EMPLOYEE-NO/2566" (in its full-blown, unfactored form) is illustrated by the set
of role-name/entity-name-set-name/entity-name triplets shown in Figure 23.
Conceptually, the entity set model differs from the entity-relationship model in the
following ways:

(1) In the entity set model, everything is treated as an entity. For example,
"COLOR/BLACK" and "NO-OF-YEARS/45" are entities. In the entity-relationship
model, "blue" and "36" are usually treated as values. Note that treating values as
entities may cause semantic problems. For example, in Figure 22, what is the
difference between "EMPLOYEE-NO/2566", "NAME/Peter Jones", and
"NAME/Sam Jones"? Do they represent different entities?

(2) Only binary relationships are used in the entity set model,³ while n-ary
relationships may be used in the entity-relationship model.


Fig. 22. The entity-set view (labels: EMPLOYEE-NO/2566, DEPARTMENT,
DEPARTMENT-NO/405)


³ In DIAM II [24], n-ary relationships may be treated as special cases of identifiers.

THE ENTITY-RELATIONSHIP MODEL TERMINOLOGY
ATTRIBUTE OR ROLE    VALUE SET                 VALUE

THE ENTITY SET MODEL TERMINOLOGY
"ROLE-NAME"          "ENTITY-NAME-SET-NAME"    "ENTITY-NAME"

IDENTIFIER           EMPLOYEE-NO               2566
NAME                 NAME                      PETER JONES
NAME                 NAME                      SAM JONES
AGE                  NO-OF-YEARS               25
DEPARTMENT           DEPARTMENT-NO             405

Fig. 23. An "entity description" in the entity-set model

4.3.2 Deriving the Entity Set View. One of the main difficulties in
understanding the entity set model is due to its world view (i.e. identifying values with
entities). The entity-relationship model proposed in this paper is useful in
understanding and deriving the entity set view of data. Consider Figures 2 and 6. In
Figure 2, entities are represented by e_i's (which exist in our minds or are pointed
at with fingers). In Figure 6, entities are represented by values. The entity set
model works both at level 1 and level 2, but we shall explain its view at level 2
(Figure 6). The entity set model treats all value sets such as NO-OF-YEARS
as "entity-name-sets" and all values as "entity-names." The attributes become
role names in the entity set model. For binary relationships, the translation is
simple: the role of an entity in a relationship (for example, the role of
"DEPARTMENT" in the relationship DEPARTMENT-EMPLOYEE) becomes the role
name of the entity in describing the other entity in the relationship (see Figure
22). For n-ary (n > 2) relationships, we must create artificial entities for
relationships in order to handle them in a binary relationship world.
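
The translation just described can be sketched for a single entity tuple. The Python below is an illustration only: the function `entity_description`, the dictionaries it takes, and the attribute name ALTERNATIVE-NAME used as a dictionary key are assumptions of this sketch, and the sample values follow Figure 23.

```python
# Attribute -> value set for the entity relation EMPLOYEE (sample schema).
EMPLOYEE_VALUE_SETS = {
    "IDENTIFIER": "EMPLOYEE-NO",
    "NAME": "NAME",
    "ALTERNATIVE-NAME": "NAME",
    "AGE": "NO-OF-YEARS",
}

# One entity tuple at level 2: attribute -> value.
employee = {
    "IDENTIFIER": 2566,
    "NAME": "PETER JONES",
    "ALTERNATIVE-NAME": "SAM JONES",
    "AGE": 25,
}

# Binary relationship DEPARTMENT-EMPLOYEE, seen from the employee:
# (role of the other entity, value set of its PK, value of its PK).
related = [("DEPARTMENT", "DEPARTMENT-NO", 405)]


def entity_description(value_sets, entity_tuple, related_entities):
    """Build role-name/entity-name-set-name/entity-name triplets."""
    # Attributes become role names, value sets become entity-name-sets,
    # and values become entity names.
    triplets = [(attribute, value_sets[attribute], value)
                for attribute, value in entity_tuple.items()]
    # The role of the other entity in a binary relationship becomes a role name.
    triplets.extend(related_entities)
    return triplets


description = entity_description(EMPLOYEE_VALUE_SETS, employee, related)
```

The resulting list has one triplet per attribute plus one per binary relationship, which is the unfactored form of an entity description.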
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